🔀 SIMD Programming
Specific
Vectorization, Parallel Computing, CPU Instructions, Performance
Scoured 121575 posts in 26.4 ms
APL Performance
📏 Linear Types
aplwiki.com · 3d · Hacker News · …
A Unified Performance-Cost Landscape of Parallel p-bit Ising Machines Based on Update Dynamics
⚡ Incremental Computation
arxiv.org · 1h · …
Metal Quantized Attention: pulling M5 Max ahead with Int8 matrix multiplication
⚡ Hardware Acceleration
releases.drawthings.ai · 1d · Hacker News · …
Building a Free AI Image Generator on 7 GPUs: Architecture Deep Dive
🎮 WebGPU
dev.to · 16h · DEV · …
Intel Binary Optimization Tool Changes Code Execution with Heavy Vectorization
📊 Profiling Tools
techpowerup.com · 2d · …
Accelerate CPU-based AI inference workloads using Intel AMX on Amazon EC2
🔢 Intel AMX
aws.amazon.com · 3d · …
abdimoallim/psimd: A portable, header-only SIMD library for C (SSE2, SSE4.1, AVX/AVX2+FMA, NEON/AArch64, WebAssembly SIMD128, scalar fallback)
🔢 AVX-512
github.com · 1d · r/C_Programming · …
'Performance without compromise': AMD debuts first dual 3D V-Cache Ryzen CPU in potential showdown against Threadripper and EPYC siblings
⚡ Hardware Acceleration
techradar.com · 2d · …
Iteratively optimizing an SPSC queue
⭕ Ring Buffers
blog.c21-mac.com · 4d · r/cpp · …
Supercharging Redpanda Streaming with profile-guided optimization
🚀 Performance
redpanda.com · 1d · …
MXFP8 GEMM: Up to 99% of cuBLAS Performance Using CUDA and PTX
🧩 mimalloc
danielvegamyhre.github.io · 5d · Hacker News · …
Why I’m Building a Database Engine in C#
🔨 Incremental Compilation
nockawa.github.io · 6d · Hacker News · …
CuTeGen: An LLM-Based Agentic Framework for Generation and Optimization of High-Performance GPU Kernels using CuTe
🎮 WebGPU
arxiv.org · 1h · …
facebookincubator/dispenso: The project provides high-performance concurrency, enabling highly parallel computation.
🌀 Naiad
github.com · 1d · Hacker News · …
HACache: Leveraging Read Performance with Cache in a Heterogeneous Array
🔁 Cache Coherence
arxiv.org · 1h · …
Performance & Recursion
🌳 Instruction Selection
dev.to · 4d · DEV · …
Generative Profiling for Soft Real-Time Systems and its Applications to Resource Allocation
⚙️ Performance Profiling
arxiv.org · 1h · …
m0at/rvllm: rvLLM: High-performance LLM inference in Rust. Drop-in vLLM replacement.
📊 Criterion.rs
github.com · 5d · Hacker News · …
Adaptive Parallel Monte Carlo Tree Search for Efficient Test-time Compute Scaling
⚡ X-Fast Tries
arxiv.org · 1d · …
AXON: An Automated Netlist Optimization Framework for High-Speed Adders
🚀 Superoptimization
arxiv.org · 3d · …